|
IBM alignment models are a sequence of increasingly complex models used in statistical machine translation to train a translation model and an alignment model, starting with lexical translation probabilities and moving to reordering and word duplication. They have underpinned the majority of statistical machine translation systems for almost twenty years. These models offer principled probabilistic formulation and (mostly) tractable inference. The original work on statistical machine translation at IBM proposed five models, and a model 6 was proposed later. The sequence of the six models can be summarized as: * Model 1: lexical translation * Model 2: additional absolute alignment model * Model 3: extra fertility model * Model 4: added relative alignment model * Model 5: fixed deficiency problem. * Model 6: Model 4 combined with a HMM alignment model in a log linear way == Model 1 == IBM Model 1 is weak in terms of conducting reordering or adding and dropping words. In most cases, words that follow each other in one language would have a different order after translation, but IBM Model 1 treats all kinds of reordering as equally possible. Another problem while aligning is the fertility (the notion that input words would produce a specific number of output words after translation). In most cases one input word will be translated into one single word, but some words will produce multiple words or even get dropped (produce none words at all). The fertility of word models addresses this aspect of translation. While adding additional components increases the complexity of models, the main principles of IBM Model 1 are constant. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「IBM alignment models」の詳細全文を読む スポンサード リンク
|